Abstract: Acquiring semantic information about the surrounding environment is an important task in semantic simultaneous localization and mapping (SLAM). However, semantic segmentation and instance segmentation slow the system down, while object detection alone reduces its accuracy. This paper therefore proposes a pixel-level segmentation algorithm that combines depth map clustering with object detection, improving the positioning accuracy of current semantic SLAM systems while preserving real-time performance. First, a mean filtering algorithm repairs the invalid points of the depth map, making the depth information more reliable. Second, object detection is performed on the RGB images and K-means clustering is applied to the corresponding depth maps; combining the two results yields a pixel-level object segmentation. Finally, the dynamic points in the environment are eliminated using these results, and a complete semantic map free of dynamic objects is built. Experiments on depth map restoration, pixel-level segmentation, and comparison of the estimated camera trajectory with the real camera trajectory are carried out on the TUM dataset and in real home scenes. The experimental results show that the proposed algorithm achieves good real-time performance and robustness.
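The first two steps of the pipeline can be illustrated with a minimal NumPy sketch. It is not the paper's implementation: the function names (`repair_depth`, `kmeans_1d`, `segment_in_box`), the 3x3 window, the encoding of invalid depth as 0, and the rule that the nearest depth cluster inside a detection box is the object are all assumptions made for illustration. The bounding box is assumed to come from some external object detector.

```python
import numpy as np

def repair_depth(depth, win=3):
    """Fill invalid (assumed: zero) pixels with the mean of valid neighbours."""
    depth = depth.astype(np.float64)
    valid = depth > 0
    r = win // 2
    d_pad = np.pad(depth, r)                      # zero padding adds nothing to sums
    v_pad = np.pad(valid.astype(np.float64), r)
    h, w = depth.shape
    total = np.zeros_like(depth)                  # sum of valid depths in the window
    count = np.zeros_like(depth)                  # number of valid pixels in the window
    for dy in range(win):
        for dx in range(win):
            total += d_pad[dy:dy + h, dx:dx + w]
            count += v_pad[dy:dy + h, dx:dx + w]
    out = depth.copy()
    fill = ~valid & (count > 0)                   # invalid pixels with a valid neighbour
    out[fill] = total[fill] / count[fill]
    return out

def kmeans_1d(values, k=2, iters=20):
    """Plain 1-D K-means with centres initialised evenly over the value range."""
    centers = np.linspace(values.min(), values.max(), k)
    for _ in range(iters):
        labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
        for j in range(k):
            if np.any(labels == j):
                centers[j] = values[labels == j].mean()
    labels = np.argmin(np.abs(values[:, None] - centers[None, :]), axis=1)
    return labels, centers

def segment_in_box(depth, box, k=2):
    """Pixel-level mask for one detection box (x0, y0, x1, y1), end-exclusive.

    Assumption: the object is the depth cluster closest to the camera.
    """
    x0, y0, x1, y1 = box
    roi = depth[y0:y1, x0:x1]
    usable = roi > 0                              # skip pixels that are still invalid
    mask = np.zeros(depth.shape, dtype=bool)
    if not usable.any():
        return mask
    labels, centers = kmeans_1d(roi[usable], k)
    roi_mask = np.zeros(roi.shape, dtype=bool)
    roi_mask[usable] = labels == np.argmin(centers)
    mask[y0:y1, x0:x1] = roi_mask
    return mask
```

In a dynamic-SLAM setting, feature points that fall inside the mask of a movable object class would then be discarded before pose estimation. A box-level rejection would also throw away the static background points inside the box, which is the accuracy loss the pixel-level mask is meant to avoid.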